This paper contributes to the discourse in stochastics education on how young students 
deal with learning settings that allow a data-based approach to probability. Using 
the micro-structure of arguments proposed by Toulmin (1958), it explores which arguments 
students use and what role these arguments play in the learning process. The data stem from 
design experiments with students at the beginning of their stochastic career (aged 11 to 
13) and are analysed with an interpretative approach. 


THEORETICAL BACKGROUND 
Integration of theoretical and experimental approach to probability 


There are several perspectives on probability, two of which will be taken into account 
here (Fig 1): the so-called ‘classical’ approach focusses on calculating probabilities 
theoretically, for instance by determining the ratio of the number of outcomes favourable 
to an event to the total number of outcomes when all cases are equally likely (Jones 
et al. 2007, p. 912). The ‘experimental’ (or ‘frequentist’) approach is more centred on 
data: “the probability of an event is defined as the ratio of the number of trials 
favourable to the event to the total number of trials” (Jones et al. 2007). 


Figure 1: Interplay of theoretical probability and relative frequency. 


Coming from theoretical considerations, trends in data collected via experiments with 
random devices (such as dice) can be predicted. Conversely, having analysed data first, the 
relative frequencies can serve as estimates of the probability distribution underlying 
the experiment. Both estimation and prediction become more reliable the more often the 
experiment is repeated. This corresponds to the definition Moore (1990) gives for the term ‘random’: 
“Phenomena having uncertain individual outcomes but a regular pattern of outcomes 
in many repetitions are called random. ‘Random’ is not a synonym for ‘haphazard’ but 
a description of a kind of order different from the deterministic one that is popularly 
associated with science and mathematics” (p. 98, emphasis in original). 
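
The stabilisation of relative frequencies described above can be illustrated with a short simulation. The following Python sketch is not part of the study: it assumes a fair six-sided die, a hypothetical helper `relative_frequency` and an arbitrary seed, and compares the relative frequency of one face with the theoretical probability 1/6. 

```python
import random


def relative_frequency(n_throws, face=6, seed=0):
    """Relative frequency of `face` in `n_throws` simulated throws of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so that runs are reproducible
    hits = sum(1 for _ in range(n_throws) if rng.randint(1, 6) == face)
    return hits / n_throws


if __name__ == "__main__":
    theoretical = 1 / 6
    for n in (10, 100, 1_000, 10_000, 100_000):
        estimate = relative_frequency(n)
        print(f"{n:>7} throws: relative frequency {estimate:.4f}, "
              f"deviation from 1/6 {abs(estimate - theoretical):.4f}")
```

In such a run the deviation typically becomes smaller as the number of throws grows, although it need not shrink at every step. 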


Open to question is how students gain and integrate understanding of these two 
perspectives (Jones et al., 2007, p. 946). 


Reasoning and arguments in stochastics 


To investigate this, this paper focusses on the activity of reasoning, as it gives insights 
into students’ sense-making processes. The term ‘informal inferential reasoning’ was first used 
in statistics; in probability education it refers to exploring and making 
inferences about trends in data generated by random devices such as dice, without 
explicitly using probability or statistical terms (Pratt et al., 2008). 


In this paper, ‘reasoning’ is understood as using arguments in interaction. To get 
further insight, the micro-structure of arguments proposed by Toulmin (1958) is 
useful: reasoning means finding arguments, each of which is a series of propositions in which a 
Claim is inferred from Evidence. The so-called Warrant links the Evidence to the 
Claim, as in the fictitious argument in Fig 2. The triple of Evidence, Warrant and Claim is called an ‘argument’. 


Evidence: Of 2000 throws with a six-sided die, 1000 are a “6”. 
Warrant: When an event appears too often, the die is unfair. 
Claim: The die is unfair. 


Figure 2: Micro-structure of argument and fictitious example. 


The example in Figure 2 links a data-focussed Evidence to a Claim about the 
probability distribution underlying the experiment. Due to page limitations, further 
elements of arguments such as the Backing will not be addressed in this paper (cf. 
Toulmin 1958). 


Applying this model to arguments in a stochastic setting adds a specific reasoning that 
refers to the random variation of data: “To understand the nature of statistical 
argument, we must consider what types of explanation qualify as answer to why 
questions. [...] Indeed, statistical inference is rare among scientific logics in being 
forced to deal with chance explanations as alternatives or additions to systematic 
explanations” (Abelson, 1995, p. 6). 
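
To illustrate what weighing a chance explanation against a systematic one can mean for the fictitious example in Figure 2, the following Python sketch computes how probable at least 1000 sixes in 2000 throws would be if the die were fair. It is not part of the study and goes beyond the informal reasoning examined here; the binomial model and the helper name `tail_probability` are assumptions made for illustration only. 

```python
from fractions import Fraction
from math import comb


def tail_probability(n_throws, min_hits, p):
    """Exact P(X >= min_hits) for X ~ Binomial(n_throws, p), with p given as a Fraction."""
    q = 1 - p
    return sum(
        comb(n_throws, k) * p**k * q**(n_throws - k)
        for k in range(min_hits, n_throws + 1)
    )


if __name__ == "__main__":
    # Assumption: a fair die, i.e. probability 1/6 for a "6" on every throw.
    p_chance = tail_probability(2000, 1000, Fraction(1, 6))
    print(f"P(at least 1000 sixes in 2000 fair throws) = {float(p_chance):.3e}")
```

Under these assumptions the probability is vanishingly small, which is one way of making the Warrant ‘appears too often’ precise. 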


Types of arguments when dealing with an experiment-based approach to 
probability 


Research in statistics and probability education has uncovered different conceptions 
and perspectives when dealing with probability that can lead to different arguments. 
The types of arguments presented here are not meant to be disjoint; instead, the 
analysis below will show that more complex arguments link different types together. 


• Data-centred arguments: The Evidence is an observation about the data; for 
instance, means of data analysis are applied to identify central trends, which 
can then be inferred as a Claim. 


• Theoretical arguments: In this case, the Evidence refers to probabilities (e.g. 
determined as the ratio of favourable outcomes to all possible outcomes) and 
probability distributions, from which a Claim is then made, e.g. about 
expected trends in the data. 


• Non-deterministic arguments: As Moore (1990) points out, ‘random’ 
describes a non-deterministic order. Students might lack words for this, but 
are sometimes able to find conceptions such as ‘unpredictability’ (cf. Pratt 
2008). These can be used as a basis for a Claim. 


• (Quasi-)causal arguments: In this last type of argument, the Evidence is used 
as a cause to make a Claim about why a certain phenomenon occurred. Learners 
might try to find causal explanations for the result of a single throw of a die, 
which according to Konold (1989) is a common misunderstanding when 
dealing with probabilities: Instead of applying stochastic reasoning, people 
perceive “the goal in dealing with uncertainty [as] to predict the outcome of a 
single next trial” (p. 61). The (quasi-)causal explanations are often ‘magical’ 
(e.g. an animistic nature of the chance device, see Wollring, 1994) or refer to 
causes that can be manipulated by the students (Wollring 1994). These 
arguments are important elements of the learners’ process of making sense of 
the interplay of uncertain individual outcomes and regular patterns in the 
long-term perspective (Pratt et al. 2008; Wollring 1994). 


RESEARCH QUESTIONS AND DESIGN 


The insights presented here are part of a broader project to investigate students’ 
processes of constructing knowledge when confronted with an experiment-based 
approach to probability (Schnell, 2014). In this paper, the specific focus lies on the 
different types of arguments in order to gain deeper insight into the processes of 
integrating theoretical and data-centred notions of probability: 


1. Which (quasi-)causal and non-deterministic explanations do students use and 
which role do they play in the learning processes? 

2. (How) Do students integrate theoretical and experimental aspects of probability 
in their arguments? 


The teaching-learning-arrangement 


To investigate this, a teaching-learning-arrangement was used that works as an 
experiment-based, informal introduction to probability in grade 6 or 7. The core 
elements are two consecutive games in which students gain points by betting on the 
results of a race between four differently-coloured animals. The didactic intention is to 
provide systematic experience with the empirical law of large numbers. This paper will 
focus only on the first game (for more details see Prediger & Hußmann, 2014 and 
Schnell, 2014). 


At the core of the game is the repeated throwing of a 20-sided die with the following 
colour distribution and the corresponding animals: red ant 7 sides, green frog 5, yellow 
snail 5 and blue hedgehog 3. While students have access to the die at all times in the 
teaching-learning-arrangement, experience shows that most of them assume an even 
colour distribution at first and discover the actual colour distribution later on. 


The length of the race (i.e. how many times the die is rolled in total) is set before the 
game starts: every number between 1 and 10,000 is possible; longer games take place 
using a computer simulation. Each throw of the die moves the corresponding animal 
one step forward on the game board or in the computer simulation; the race is finished 
and the results are compared when the previously determined number of throws is 
reached. The animal with the highest absolute frequency is the winner of the race. 
Further investigation of the data is motivated by the questions “which animal is the 
best” and “when can you be as sure as possible that this animal will win”. To 
systematically compare the results of short races (e.g. 1, 10 or 20 throws) with each 
other but also with the results of long races (e.g. 100, 1000 or 10,000 throws), the 
teaching-learning-arrangement provides record sheets and tasks focusing on these 
comparisons. 
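
The comparison of short and long races can be sketched in Python as follows. This is an illustration under stated assumptions, not the simulation used in the teaching-learning-arrangement: the function names, the number of simulated races and the seed are hypothetical, and because the text does not specify how ties are handled, a race counts as won here only if one animal is strictly ahead. 

```python
import random
from collections import Counter

# Colour distribution of the 20-sided die as described in the text.
SIDES = {"red ant": 7, "green frog": 5, "yellow snail": 5, "blue hedgehog": 3}


def simulate_race(n_throws, rng):
    """Return the animals with the highest absolute frequency after `n_throws` throws."""
    animals = list(SIDES)
    throws = rng.choices(animals, weights=[SIDES[a] for a in animals], k=n_throws)
    counts = Counter(throws)
    best = max(counts.values())
    return [animal for animal in animals if counts[animal] == best]


def sole_win_rate(animal, n_throws, n_races=500, seed=0):
    """Share of simulated races that `animal` wins outright.

    Assumption: ties are not counted as wins (the text leaves tie handling open).
    """
    rng = random.Random(seed)
    wins = sum(simulate_race(n_throws, rng) == [animal] for _ in range(n_races))
    return wins / n_races


if __name__ == "__main__":
    for n in (1, 10, 20, 100, 1000):
        print(f"{n:>5} throws: red ant wins {sole_win_rate('red ant', n):.2f} of races")
```

In such a sketch the red ant wins only a minority of one-throw races but almost every race of 1000 throws, which mirrors the experience with the empirical law of large numbers that the game is designed to provide. 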


Methods 


The author conducted design experiments (Cobb et al., 2003) with nine pairs of 
students (grade 6, German comprehensive school, ages 11 to 13) in a laboratory 
setting. Each design experiment took place across four to six sessions of 60 to 90 minutes 
each; the game described above was finished within the first session for eight pairs and 
within the second session for one pair (Emily and Leo). The data corpus includes 
videos, screen captures of the simulation, transcripts and all written products such as 
record sheets. 


The research questions were addressed by qualitatively analysing the transcripts and 
videos turn by turn. In a first step, all arguments were identified, i.e. all statements in 
which Claims and Evidence were explicitly stated and connected by a(n implicit) 
Warrant. This paper focusses on arguments with Claims about observations related to 
data or theoretical aspects (leaving out other arguments, for instance about the quality 
of a prediction; cf. Schnell 2014 for a broader investigation); in total 49 arguments 
were identified¹. Then, these arguments were coded and categorized in terms of the 
type of argument and Evidence. Selected results of the analysis are presented here. 


¹ This number refers to arguments that were newly constructed in the course of the design 
experiments. Repetitions of a previously used argument (e.g. repeating that the red ant is 
superior because of the colour distribution) are not included. If two different pairs of 
students construct the same argument, both are counted. 


ANALYSIS AND DISCUSSION OF DIFFERENT TYPES OF REASONING 


The role of (quasi-)causal and non-deterministic arguments 


(Quasi-)causal arguments: This category shows the most variety in the analysis, which 
is in accordance with the literature (Jones et al. 2007). 17 of the 49 arguments can be 
coded as (quasi-)causal. All but one of these arguments are constructed before the 
discovery of the colour distribution. By looking more closely at the Evidence, 
subcategories can be built. Some of these subcategories are: 


• Device-focussed: Students try to find causes for outcomes by focussing on the 
die (or the computer simulation, but no participant did that), such as <When 
the die is manipulated, you get an unwanted outcome> (RS-24:35)* or 
animistic conceptions such as <When the die is evil, you get an unwanted 
outcome> (DJu8- 31:55). Some students use the physics of throwing the die 
as a cause for the result. For one pair (Ramona and Sarah), this argument 
dominates the first 30 minutes: <When the die rolls for a short distance, the 
outcome is blue> (RS-22:39). All these arguments focus on explaining 
outcomes of a single throw of the die. 


• Property-focussed: This subcategory includes two arguments that claim the 
superiority of the red ant comes from the colour red itself or the specific 
animal: <When the colour is red, then the animal is on fire and is thus the 
fastest> (EL-41:42) and <When an animal has long legs, it wins more often> 
(DeK-87:50). The latter argument is the only causal argument that is built 
after the colour distribution is discovered. 


Non-deterministic arguments: Only two arguments could be identified as solely 
non-deterministic: They are both created by the same pair of students and use the 
concept ‘luck’ as Evidence, for instance: <When bad luck happens, then a series of red 
is ended by green> (EL-43:38). Both were also constructed before the colour 
distribution of the die was discovered. Even though other students also refer to good or 
bad luck, they are not using it explicitly as Evidence for a Claim. In three other cases, 
non-deterministic Evidence is combined with theoretical insights; these are discussed 
below. 


Addressing the first research question of the role of (quasi-)causal explanations: 
Looking at the overall picture, six out of nine pairs of students built (quasi-)causal and 
non-deterministic explanations. The variety of different explanations is in line with 
findings in the literature (cf. Jones et al. 2007). The device-focussed arguments are 
concerned not only with explaining single outcomes, but also with undesired results. 
This might indicate that students tend to deal with experiences that contradict their 
expectations by building these explanations. This observation is in line with 
Wollring’s (1994, p. 136) observations about the behaviour of children in primary 
school. Property-focussed arguments are used to explain the superiority of the red 
animal, rather than single outcomes. In all but one case, these (quasi-)causal 
arguments are built before the discovery of the colour distribution of the die. 


* Noted in < > is the reconstructed Warrant of the form ‘When Evidence, then Conclusion’; in 
parentheses is the code for the pair of students and the time-stamp at which the argument was 
verbalised for the first time. 


The role of data-centred and theoretical arguments 


Data-centred (8 arguments of 49): Due to the guide question “which coloured animal 
is the best” and the provision of record sheets, all students focussed on the produced 
data. The arguments counted here are those in which students explicitly used a data-centred 
Evidence to make a Claim, such as <When red ant has won most often, it is the best 
animal>. They were built and mainly used before the colour distribution was 
discovered and seemed to disappear afterwards. 


Theoretical arguments (11 arguments): The discovery of the colour distribution of the 
die is crucial for giving meaning to the patterns in data such as the red ant being more 
likely to win. Therefore, students who do not discover the colour distribution by themselves 
are prompted by the research teacher (two pairs). As a result, all pairs of students use 
arguments like <When red has more faces than the other colours, the red animal is 
more likely to be rolled>. 


Combination of theoretical and data-centred (8 arguments): The data analysis shows 
that theoretical and data-centred arguments were combined in some cases, for instance 
<When red is more on the die, red is rolled more often and thus red ant wins more 
often> (EL2-27:10). This argument uses theoretical Evidence to claim the empirically 
observed superiority of the red ant. Furthermore, it might refer to the connection 
between the red ant winning a whole race (i.e. having the highest absolute frequency) 
and the single outcome of one throw of the die. 


Combination of theoretical and non-deterministic (3 arguments): Three of these 
combined arguments could be identified: <When blue hedgehog is lucky, it gets the 
three blue faces very often and can win> (RS-30:10), <When red ant is unlucky, it 
loses in races with an even number of throws even though it is superior> (RS-67:16) 
and <When you are lucky, an animal with fewer chances wins> (RS-41:15). Here, 
Ramona and Sarah start with (good/bad) luck as Evidence and make a Claim that this 
might interfere with the chances derived theoretically from the colour distribution. 
This could be interpreted as the integration of experienced random variation in single 
outcomes with data-based and theoretical insights. 


Addressing the second research question concerning the integration of data-centred 
and theoretical aspects of probability: 


Some arguments could be identified which combine theoretical and data-based 
aspects. Here, patterns (superior red ant) are related to the colour distribution (7 out of 
20 sides on the die). In these arguments, the theoretical insight serves as Evidence to 
make a claim related to data-centred observations. Looking at the sequence in which 
the different arguments were built, it is noticeable that after the discovery of the colour 
distribution, no new, solely data-centred arguments were built. One pair of students 
also combines theoretical and non-deterministic insights to explain situations in which 
it encounters random variation (e.g. a losing red ant). A variety of micro-processes of 
combining different insights were investigated in Schnell & Prediger (2012). 
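
As a quick consistency check, the counts reported for the individual types of arguments can be tallied. The Python sketch below uses only the figures given in the text and assumes that the six reported categories are mutually exclusive; the variable names are chosen for illustration. 

```python
# Argument counts as reported in the analysis sections above.
argument_counts = {
    "(quasi-)causal": 17,
    "non-deterministic": 2,
    "data-centred": 8,
    "theoretical": 11,
    "theoretical + data-centred": 8,
    "theoretical + non-deterministic": 3,
}

total = sum(argument_counts.values())

if __name__ == "__main__":
    for label, count in argument_counts.items():
        print(f"{label:<32} {count:>2}  ({count / total:.0%})")
    print(f"{'total':<32} {total:>2}")
```

Under this assumption the categories add up to the 49 arguments identified in total. 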


CONCLUDING REMARKS 


This paper gives a short insight into the types of arguments that students at the 
beginning of secondary school use when working on an experiment-based setting 
introducing probability. The in-depth analysis shows how they not only make 
connections between theoretical and data-centred aspects, but also integrate 
non-deterministic arguments in a meaningful way. This supports the claim that 
informal conceptions are important for individual learning pathways (Pratt et al. 2008; 
Schnell 2014). 


Another observation is that the discovery of the colour distribution seems to lead to a 
decline in (quasi-)causal and solely data-centred arguments. This raises the question of 
whether there is some kind of implicit hierarchy between the different types of 
arguments. The presented data suggests that students might be aware of a superiority of 
theoretical arguments over other types of arguments. To uncover the relations between 
different types of arguments, it might be fruitful to take into account further elements 
of arguments such as the Backing for the Warrant and Rebuttals (Toulmin, 1958). 